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Abstract 

Assume that we observe a large number of signals, all of them with identical, although unknown, 
^~ shape, but with a different random shift. The objective is to estimate the individual time shifts and 

O their distribution. Such an objective appears in several biological applications Hke neurosc.ence or 

ECG signal processing, in which the estimation of the distribution of the elapsed time between 



repetitive pulses with a possibly low signal-noise ratio, and without a knowledge of the pulse shape 
is of interest. We suggest an M-estimator leading to a three-stage algorithm: we first split our data 
set in blocks, then the shift estimation in each block is done by minimizing a cost function based 
on the periodogram; the estimated shifts are eventually plugged into a standard density estimator. 
We show that under mild regularity assumptions the density estimate converges weakly to the true 

-4— i shift distribution. The theory is applied both to simulations and to alignment of real ECG signals. 

"^ The proposed approach outperforms the standard methods for curve alignment and shift density 

estimation, even in the case of low signal-to-noise ratio, and is robust to numerous perturbations 



common in ECG signals. 



CN| Index Terms 

^^ semiparametric methods, density estimation, shift estimation, ECG data processing, nonlinear 

OO inverse problems. 

o 
> 

■^ I. Introduction 

C^ We investigate in this paper a specific class of stochastic nonlinear inverse problems. We observe 

a collection of M + 1 uniformly sampled signals in a finite interval [0, T] 

Vjiti) = s{ti -9^) + aSjiU), ti G [0,r], i = 0. . .M (1) 

where s is an unknown signal, {9j,j = . . . M} are independent real-valued continuous ran- 
dom variables with common probability density function / which represent a shift parameter, and 
Eq, . . . , Em are independent standard white noise processes with standard deviation a and independent 
of 6*0, ... , 9m- Our aim is to estimate either {%, j = . . . M}, or the shift distribution /. Similar 
models appear commonly in practice in numerous fields. For instance, a common problem in functional 
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data analysis (FDA) is to align curves obtained in a series of experiments with varying time shifts, 
before extracting their common features; we refer to |[1] and ||2l for an in-depth discussion on the 
problem of curve alignment in FDA applications. In data mining applications, after splitting the 
data into different homogeneous clusters, observations of a same cluster may differ. Such variations 
take into account the variability of individual waveforms inside one given group. In the framework 
described by ([!]), the knowledge of the translation parameter 9, and more specifically of its distribution, 
can be used to determine the inner variability of a given cluster of curves. Several papers (e.g. Q, 
Bl . ||5l . ||6l , iTTll ) focus on this specific model for many different applications in biology or signal 
processing. Such a problem can also be related to curve alignment problems as in |[T1, which 
are typically encountered in medicine (growth curves) and traffic data. Many methods previously 
introduced rely on a preliminary estimation of s, thus introducing an additional error in the estimation 
of {9j,j = . . . M}. For example, ||6l proposed to estimate the shifts by aligning the maxima of the 
curves, their position being approximated by the zeros of a kernel estimate of the derivative. Similar 
discussions can be found in recent contributions in the system identification framework, e.g. [81, ||9l, 
ifTOl . In particular, |i9| provides a two-stage algorithm which estimates jointly a parametric component 
and a functional. Since we do not rely on any information or estimation regarding s in this paper, 
it is of interest to consider it as a nuisance parameter, and the shifts {9j,j = . . . Af} (or /) as a 
parameter of interest, and consider ([T} as a semiparametric model as described in lOTI . However, if 
one is interested in s while having estimates of the shifts ^i, . . . , ^a/, one can easily proceed and use 
s{t) = M~^ ^7=0 yj(^ + ^i) ^^ ^^ estimate of the signal s. 

Our contribution is close to recent shift estimation techniques described in | fT2l and |fT3l|. Both rely 
on the fact that the spectral density of one given signal remains invariant by shifting, and therefore, it 
is well fitted for semiparametric methods when s is unknown. In llT2ll . the problem addressed is the 
joint estimation of K shifts parameters, when if is a fixed number of curves (unlike what is done 
in the current paper where K — )• oo). This leads to an efficient semiparametric estimation technique, 
similar to the papers of llT4l . ifTSll . The advantage of such an estimator is that it is statistically efficient, 
consistent and asymptotically normal. However, when the number of curves to process is important, 
the method leads to a computationally intensive optimization problem. It is therefore of interest in 
practical applications to deal with blocks of smaller size which include one identical reference curve, 
as done in this paper. In |[T3l . the authors estimate the shift probability density function when the 
number of curves is infinite, but the corresponding alignment procedure is performed one curve after 
the other, by means of the minimization of a penalized likelihood function. Such an approach makes 
sense when we have a few curves to compare, but when we dispose of many signals, the shifts 
parameters may be estimated jointly and more efficiently. 

In our main application, we analyze ECG signals. We aim at situations where the heart electrical 

August 16, 2010 DRAFT 



IEEE TRANSACTIONS IN SIGNAL PROCESSING 3 

activity remains regular enough in the sense that the shape of each cycle remains approximately 
repetitive, so that after prior segmentation of the ECG recording, the above model still holds. This is 
the case for heart malfunctions such as sinus or supraventricular tachycardia, as mentioned in [16|. 
This preliminary segmentation can be done efficiently, for example, by taking segments around the 
easily identified maxima of the QRS complex, as it can be found in , or by means of digital filters 
as suggested in ifTTll . It is of interest to estimate {Oj,j = . . . M}, in ([T]l, since these estimates can be 
used afterwards for a more accurate estimation of the heart rate distribution. Another measurement 
often used by cardiologists is the mean ECG signal. A problem encountered in that case is that 
improperly aligned curves can yield an average on which the characteristics of the ECG cycle are 
lost. The proposed method leads to a more efficient estimation of the mean cycle by averaging the 
segments after an alignment according to a well-estimated shift. 

The paper is organized as follows. Section |ll] describes the assumptions made and the method to 
derive the estimators of the shifts and of their distribution. This method is based on the optimization 
of a cost function, based on the comparison between the power spectrum of the average of blocks of 
curves and the average of the individual power spectra. Since we consider a large number of curves, 
we expect that taking the average signal will allow to minimize the cost criterion consistently. We 



provide in Section III theoretical results on the efficiency of the method and on the weak convergence 
of the density estimate. In Section |IV| we present simulations results, which show that the proposed 
algorithm performs well for density estimation, and study its performances under different conditions. 
We also applied the methodology to the alignment of ECG signals, and show that the proposed 
algorithm outperforms the standard FDA methods. Proofs of the discussed results are presented in 
the appendix. 

II. NONPARAMETRIC ESTIMATION OF THE SHIFT DISTRIBUTION 

In this section, we state the main assumptions that will be used in the rest of the paper, and propose 
an algorithm which leads to an M-estimator of the shifts. Using these estimators, we obtain a plug-in 
estimate of the shift probability density function. 

A. Assumptions 

Assume that we observe M + 1 sampled noisy signals on a finite time interval [0,T], each one 
being shifted randomly by 6; a typical signal is expressed by 

Vjik) = s{ti - 9j) + aej{ti), 

(i - 1)T ^^^ 

^ - ^ ^ i = l...n, j = 0...M, 



n 
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where the processes {ej,j = . . . M} are assumed to be standard Gaussian white noises, and the 
variance cj^ is assumed to be constant. We also assume that the whole signal is within the samphng 
frame, which can be formalized by the following assumption: 

(H-1) The distribution of 6 and the shape s both have bounded non-trivial support, [0,Tq] and 
[0, Ts], respectively, and Te + Ts < T. 



As pointed out in 11181 . under this assumption we can consider s as a periodic function with associated 
period T. Without any loss of generality, we further assume that T = 27r in order to simplify notations. 
We also assume: 

(H-2) s G L'^{[0,Ts]) and its derivative s' G L°°. 



Assumption (H-1) implies that we observe a sequence of identical curves with additive noise, so 



that the spectral information is the same for all curves. Assumption (H-2) is critical to guarantee the 
existence of the Energy Spectral Density (ESD) of the studied curve and of the terms appearing in 
later sections. Note that the boundedness of the derivative is assumed for the sake of convenience (in 
order to show easily that the discretization error in the later parts can be neglected); the proposed 
method would also give good results on curves showing discontinuities. We finally make the following 
assumptions on the random variables appearing in Q: 

(H-3) The shifts {Oj,j = 0. . . M} are continuous random variables, independent and identically 
distributed with common probability density function / which is assumed to be uniformly 
bounded. We also consider the first shift 9q as known, and without loss of generality we 
fix ^0 = 0. Finally, we assume that the variables {£j{ti),j = 0, . . . , M, i = 1, . . . , n} are 
standard normal independent random variables, which are also independent of {Oj,j = 
0...M}. 

B. Computation of the shift estimators and of their density 

The intuitive idea of the proposed algorithm is as follows. Assume, for the sake of the argument, 
that (T = 0; then, when the shifts are known and corrected, the individual signals are equal to their 
average. Consequently, the average of their ESDs is equal to the ESD of the mean signal. On the other 
hand, if the shifts are not corrected, then the average signal is a convolution of the original shape 
with the shift distribution, and hence its ESD is strictly different from the average of the individual 
ESD's. 

Following the method of lfT3l . we propose to plug estimators of {^j, j = 1 . . . M} into an estimate 
of /. We start by splitting our dataset in N blocks of i^+1 curves each, as shown in Figure [T] Observe 
that 2/0 is included in each block, since all the rest of the signals are aligned with it. The motivation 
to spht the dataset into smaller blocks is twofold: it reduces the variance of the estimators of the 
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Fig. 1. Split of the curves data set 
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Block jV 



shifts by estimating them jointly, and also provides smooth functions for the optimization procedure 
detailed in this section. The first step is therefore to estimate the vectors of shifts {9m, m = 1 . . . N}, 
where for all integer m, 6.^ = {0(_rn-i)K+i, ■■■, OmK)- 

The estimation of 6„i is achieved by minimizing a cost function. For any continuous-time and 
27r -periodic signal y, we denote by Sy its energy spectral density, that is for all uj: 



Sy{u) 



1 

2^ 



27r 



2/(t)e-*'^* dt 







(3) 



This quantity is of interest, since it remains invariant by shifting. For each integer ?n, = 1 . . . A^, we 

A 



define the mean of K signals translated by some correction terms Q.m 

ym{t;ctm) 

A 1 



[»{m-l)K+l, ■ ■ -lOimK)- 



(4) 



mK 



K + \ 



Xyo{t)+ Yl yiit + ^i)]^ 

l={m-l)K+l 



where A = X{K) is a positive number which depends on K, and is introduced in order to give more 
importance to the reference signal yo- For any m, = 1, . . . , A^ we now consider: 



M + i^^y 

1=0 



S.. 



y„(-;a™) 



(5) 



The function described in Q represents the difference between the mean of the ESDs and the ESD 
of the average signal of the m-th block. Since the observed signals are sampled, the integral of Sy 
will be in practice approximated by its Riemann sum, that is 



Sy{k) 



1 " 

- Y] yiir. 



'2\iTmk / n 



where /C = { 



in=l 

n — 1 n — 3 



keIC 
n — 1 
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Let the sequence Cm{cirn) = {Cm{k, ocm) '■ k G /C} be defined by 



1 ^ . 
Cm{k,am) = TT—T^^yXk) - Sy^^^..a^){k) , (6) 

1=0 

and let {v^, A: € /C} be a sequence of nonnegative numbers such that u^k = t^k and ^^ k^Uk < oo 
when n tends to infinity. The proposed M-estimator of 6m is denoted by 6m and is given by 

6m= Argmin \\Cm{am)\\l, (7) 

Q;„G[0;27r]^^ 

where \\Cm{am)\\l = Y.keic^k\Cm{k,Ckm)\'^- 

Remark 2.1: As aforementioned, all the blocks have one curve yo in common. We impose this 
constraint in order to address the problem of identifiability. Without this precaution, replacing am by 
ctm + (c, c, . . . , c), c G M and the signal s(-) by s{- — c) in the m-th block would let (|6]l invariant. 
Adding the curve yo in each block as a reference allows to estimate the shifts with respect to a same 
common reference. 

The estimator of the probability density function /, denoted by fM^h, is then computed by plugging 
the estimated values of the shifts in a known density estimator, such as the regular kernel density 
estimator [|191 . that is for all real x in [0; 2it]: 

where the kernel ^ is a nonnegative function integrating to 1 with a bounded derivative and h the 
classical bandwidth parameter of the kernel. In this paper we provide a proof of weak convergence 
of the empirical distribution function of {9j, j = 1 . . . M} under some mild conditions. More 



specifically, we shall get from Theorem 3.2 that fM,h{x) converges pointwise to f{x) when both 
M — )• cxD and n — )• oo. 

III. Theoretical aspects 

We provide in this section theoretical results on the estimators described in Q and (|8). We denote 

by Cs{k) the discrete Fourier transform (DFT) of s: 

1 ^ 
n ^-^ 

m=l 



and by /fc,/ the DFT of yi\ 



1 . . 
A,/ = - E yKtm)e-'''^"^'/" , A: G /C, / = . . . M . 

m=l 
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Let 6i = 6i + ei where |e„| < 7r/n and (9; G {ti, . . . , in}- Using this notation, relation S becomes in 
the Fourier domain for all A; G /C and / = . . . M: 

1 " 

n ■^^ 

m=l 

+ ^ (I4,i + iVFfc,0 



e-i^e^iys(t^-eOe-2-™'^/" (9) 

n ^ — ^ 



n 

rn=l 



a 



+ ^ (yfc,i + iWk, 



n 



-ike, ..N . ^/. -Is cr 



'c,(A:) + 0(A:n-i) + -^ {Vk,i + iWk,i) , 

where the last equality stems from Taylor-Lagrange inequality due to |(H-2)| By the white noise 
assumption (H-3) the sequences {Va,.;, A; G /C} and {M^fc,;, k G /C} in (|9]l are independent and 



identically distributed with same standard multivariate normal distribution J\fn{0,ln)- The 0{kn^^) 
term is a result of the sampling operation and is purely deterministic; since it is assumed that 
J2k ^"^^k < oo, the contribution of this deterministic error to the cost function shall be no more 
than 0(n^^), and will further on be neglected since it is not going to induce shift estimation errors 
greater than the length of a single bin (i.e. n~^), while it will be shown that the statistical estimation 
error is Op(n^^/^). 



Note that Assumption (H-3) might be considered too strong, and we indeed state it for simplicity. 
It can be weakened to include more general random variables £j{ti), . . . ,£j{tn), as long as the 
Central Limit Theorem can be applied in (|9]). The y/n term appearing in this equation should be 
then understood as the normalization constant of the mean error in the /c-th tap. In particular the 
homogeneity of the noise distribution is not needed, and some dependency may be permitted as 
long as the sequence remains with some mixing property. The process should essentially be such 
that for any < a < 6 < T, maxjVar((& — a) X]a<t <b^i(*«) — '^n^'^ib — a) , where c„ — ;> 
and lim5_^o<7^((5) = 0. One important situation in which the error terms are not independent and 
identically distributed is when an adaptive sampling strategy is adopted, such that the acquisition of 
the observations is concentrated around interesting points of the signal. This discussion is beyond the 
scope of this paper. 

A. Heuristic argument and asymptotic expansion 

Before detailing the complete derivation of the estimate properties, we give in this section a heuristic 
argument which shows the consistency and of 0i in a simpler case. We assume for simplicity that 
M = i^ >> n — )• oo and that all the shifts {6j, j = . . . M} are equal to zero, so that cti represents 
only the error made during alignment and we only have one block to process. We also assume that the 
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signal s is an odd function, so that Cs{k) = ick is a non-zero imaginary number and there is no reason 
to align the curves with respect to yo- We take therefore without loss of generality X]m=o '^™ ~ ^• 
Define for all integers k and / : V^,/ = o"n~^''^Vk^i and Wk,i = an~^''^Wkj, and define the random 
variables Rk^i and Pk,i so that we can write Vk^i + iWk,i = Rk,ie^^'''- First observe that since 



1=0 1=0 



then Ci(k,ai) > due to Cauchy-Schwarz inequality. Since K tends to infinity, the mean energy 
spectral density, the first term on the right-hand-side (RHS) of ([6]) is approximately that equal to 

4 + OriK-^/^), so that: 



Ci{k,ai) = cl + Op{K 



-1/2^ 



1 ^ 



1=0 
.2 , r^_/t— 1/2N 



4 + OriK- 



K K 



-—2 ^ Y^ (cl cos{{ai - am)k) 



^ ' ' 1=0 m=0 
- '^CkRk,m Sm{{ai - am)k - /3k^rn) 

+ Rk,iRk,m cos{{ai - am)k + Pk^l - I5k,m)^ 

Expanding the harmonic functions up to op(n~^), assuming that the am = Op(n~^/^) (which shall 
be later proved in Theorem 3.1 1, and noting that Rk^m = Op(n^^/^), we get for any fixed k that 

K K 
- rj^ ,,-.2 y^ y^ Rk,lRk,m COs{{ai - am)k + /3k,l - f3k,m) 



1=0 m=0 
K K 



— ~ /T^ I i\2 / . / , Rk,lRk,m COs{Pk,l — l^k,m) 
^ ' 1=0 m=0 

K K 

"^ (K -\- n2 y^ y^ Rk,iRk,m sin(/3fc,; - Pk,m){ai - am)k + op(n"^) 

^ ^ 1=0 m=0 

K K 

^ ~ (K + 1)^ ^ ^ -Rfc,«^fc,m COs{Pk,l - h,m) + Op(n"^) . 
'^ '^ > 1=0 m=0 
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Since we assumed that J2'm=o '^™ ~ ^' ^^ obtain: 

K K 



(jy- ,-,-.2 y^ y^ 2ckRk,m sin((a/ - am)A; - I3k,m) 



Z=0 m=0 



T Z^ Rk,m sin(/3fc m) 



if +- 

m=0 

2A;cj. 






m=0 /=0 

2c ^ 



?Tt=0 

Using the same assumption, we get that 

K K 



K + . 

m=0 
,r. , -,N2 ZZ ^^^">- COs(/3fc,m)am + Op(n~^] 



^ ' 1=0 m=0 

p^2 K K 

~^^ ^ 2(K +\)2 X X ("' ~ "™)^ + °P(" ^) 

*^ '^ /=0 m=0 



Hence, the Taylor expansion of the cost function is equal to 

Ci(A;,q;i) 



K K 



(T^ ,-i\2 7 , 7 , Rk,lRk,m COs{f3k^l — Pk^r, 



l=Q m=0 

- T^ ,-, X ^'^•'" cos(/3fc,m)am - -jy—7 X ^'^'"^ Sin(/3fc,m) 

m=0 m=0 

m=0 

which is minimized by taking 6m = Rk,m cos{/3k,m)/kck + op(n~^/^) + Op(if ~^/^). More generally, 
when the different bands are weighted, we obtain by differentiation 

t>m — ^^^ r^-4 l-op(n ' ) + (Jp[K ' ) 

which establishes the asymptotic expansion (up to the first order) and the asymptotic normality of 
the estimate when both n and K tend to infinity. 
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B. Computation of the cost function Cm 

The cost function Cm associated with block m can be written as follows: 

\2 



\Cm{am)\\l = ^Vk {Auik) - Bm{k, Om)? 
keK. 

+ ^ Z^fc {Bm{k, Om) - Bmik, Qm))^ 

fce/c 
+ 2 ^ Z^fc (Bmik, 6m) - Bmik, am)) 



(10) 



fce/c 

X {AMik)-Bm{k,em)), 

where A]\f{k) and Bm{k, ctm) are the first and second terms of the RHS of ([6]), both taken at point 
k. We focus on the expansion of the terms associated with ||Ci(q;i)||^, since all other cost functions 
may be expanded in a similar manner up to a change of index. We detail the expansion of AM{k) 
and Bi{k,ai), since Bi{k,di) can be easily obtained from the latter term. 
Recall that Auik) = j^ Ef=o IfkA"^' we get that 

2 



1 ^^ 



e-""-c,(k) + ^{Vtj+iWtj) 



n 



1 ^i 2 

1=0 



+ ^Vk^iReie-'^'^Csik)) + ^Wk,ilm{e-"''' Cs{k))} 



/n \Jn 

Due to the equalities Re(e~^^'Cs(/c)) = cos(/i;6'i)Re(cs(A;))+sin(A;0/)Im(cs(A;)) and \m.{er^^^ Csik)) 
cos(fc^/)Im(cs(A;)) — sin(A;0i)Re(cs(/c)), it follows that 



2 ^ 

AM{k) = \csik)\' + ;^p^ E(^M + Ki) (11) 

+ ^t(Mfu T. (^^.' ^°«(fc^') - ^M ^i-(^^')) 



Remark 3.1: By Assumption (H-2)| and the law of large numbers the last two terms of (H) converge 



almost surely to as M tends to infinity. Moreover, the sum of the second term has a x^ distribution 
with 2(M + 1) degrees of freedom. Thus, the term AM{k) tends to |cs(A;)| + An^^a"^ as M — )• cxd, 
and therefore to Ss{k) as both M and n tend to infinity. 
The first curve of each block is the reference curve, which is considered to be invariant and thus has 
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11 



a known associated shift, so that ao = 9q = 6q = 0. It stems from ([4]) and Q that 



Bi{k,ai) 



1 



X + K 



a 



\{cs{k) + ^{Vk,o + -^Wkfl)) 



K 



+E «' 



}k{ai-9i) 



a 



c,{k) + —e'^''^{Vk,i + \Wk,i] 
'n 



1=1 
thus, if we define A^, m = . . . K, such that Aq = A and Am = 1 otherwise: 

Bi{k, ai) 



(K + A)2 






K 



X 



E^^ 



\m=0 

and expanding the latter yields 

Bi{k,ai 



,iA:(6»„-a„) 



O" 



n 



-ika„ 



(V, 



k,m 



iWi 



k,m, 



\csik)\' 



K 

^ ' Lm=0 



Jk{ai-ei-a^+9„,) 



K 



n{X + K)^ 



Y, AA™{e''("'-"" 



l,m=0 



[Vk,lVk,m + Wk,lWk,m + K^kdWk^m - Wk,lVk,ni)]} 
K 



+ 



acs{k) 



^{X + Kf 
^{X + Kf 



Y, XlXr, 



Jk{ai-0i-ara) 



{Vk,m - iWk,, 



l,m=0 
K 



Y^ XiXn,e'' 



k{e^+ai-a^) 



(Vfc,i + \Wk,l 



l,m=0 



(12) 



The functional ||Ci(q;i)||^ can be split into a stochastic part which depends on the random variables 
{Vk,!, k = — ^^ • • • ^^} and {Wk,i, k = — ^^ • . . ^^}, and a noise-free part which does not 



depends on them, and is further on denoted by Di{ai). Observe that the first sum in (J^i is equal to 
|cs(A:)p when taking ai = Oi; consequently, all the terms stemming from the first and the third sum 



n— 1 n—1 



in ( [To] ) depend on the random variables {14,;, k 

and are not part of the functional L'i(a!i). This term is equal to: 



} and {Wk,i, k 



n—l n—1 



}> 



(13) 



Yi^k\cs{k)\^ 



fce/c 



1 ^ 



K + X 



in=0 



1 



Details of the calculations are given in Appendix -A Note that due to ( [13] ), Di has a unique global 
minimum which is attained when am = Om, for all ?ti = 1 . . . , K, that is the actual shift value. We 
show in Proposition 3.1 that ||Ci(q;i)||^ — Di{ai) is negligible when both n and K tend to infinity. 
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under mild assumptions on A, so that the proposed cost function behaves asymptotically like L'i(q!i). 

Proposition 3.1: Assume that K — )• oo, n — )• oo, A — )• oo, and \/K — )• 0. Denote the noise part 
associated with Bi{k,Oi) — Bi{k,ai) by R{k,ai), that is 



Rik,ai) = Bi{k,ei) - Bi{k,ai) 



\csm' 



1 ^ 



fc(«,„-e™) 



K + X 



Then: 



J^ Uk{Am{k) - Bi{k, 9i)f= Op (^) + Of 

k&K V"- / 



m=0 
1 



+ Op(-')inf^^— - ^KX 



n) c K + \ 



a 



m ^m 



m=0 



Y,Vk{A^{k) - Bi{k,ei)) 
keK. 

X {Biik,ai)-B,{k,ei)) 



(14) 



Of 



'^+0 '' 



n^ 



nK 



where the Op hold uniformly in ai. 



Proof: See Appendix -B 



Since Oi is the minimizer of Ci and Di{6i) = 0, we get by means of Proposition 

Z)i(0i) = ||Ci(0i)||2 + (Z)i(^i) - ||Ci(0i)||2) 

<\\c,{e,)\\l + {D,{e,) -\\c,{e,)\\l) 
-[D,{e,)-\\c,{er)\\l) 



3.1 



that 



{D,{e,)-\\c,{e,)\\l] 



(15) 



[Di{6^] 



Of 



n 
+ Op(-linf 



^l+'^^^nKj 



Ci{ei)\\l) 



1 1 ^ 

( ~ ) ™f T^ I X y~! -^m(^m - 61^ - c) 

Vn/ c A + A ■^^ 



m=0 



thus showing that Di{Oi) is close to zero as both n and K tend to infinity. The main result is using 
the fact that the only minimizer of Di is the true vector of shifts. 
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C. Theoretical properties of the shift estimation algorithm 

The following result gives information on the number of curves well aligned in a given block, and 



holds for each term in the sum of Equation ( 13 1. 

Proposition 3.2: Let r] — )• as n, ii' — )• oo, A > 1 and let 6 he a real positive number. Suppose 
that ai, . . . , am is any sequence such that: 

1 



K 



(K + X) 



m=0 



Jk(e,„-a,„) 



> l-rj 



for some k £ IC. Then there exist two positive constants 70 and Kq, such that for K > Kq, there is 
a constant c such that the number of curves whose alignment error am — Sm — c is bigger than rj^, 
is bounded by 70 (i^ + X)r]^^'^^. Moreover, 

(K + X)rj 



K 



T.((^r, 



ar 



c?< 



m=l 



7oA;2 



(16) 



Proof: See Appendix -C 



Proposition 3.2 has the following motivation: when the number of curves in each block is large 



enough, the noise contribution to the criterion will be small, and di will be such that the condition 
of the proposition holds. Hence, we can conclude that most curves will tend to align. However, they 
may not align with the reference curve yo- Consequently, the weighting factor A is introduced in order 
to "force" all the curves in a block to align with respect to yo, as stated in the following proposition: 



Proposition 3.3: Assume that A is an integer, and that 77 < X/{'jo{K + A)), where 70 is the 



positive constant appearing in the previous proposition. Then, under the assumption of Proposition 3.2 



we get that \c\ < rj 



Proof: See Appendix -D 



In other words, when A is chosen such that A — )• 00 and X/K — )• as ET — )• 00, the estimate will 
be close to the actual shifts. We now state the main theorem: 

00, n -^ 00, A = X{K) — ^ 00, n^/^X/K — > 



Theorem 3.1: Under Assumptions (H-l)-(H-3) 



ifK 



0, and n/K is bounded, then for all 5 E (0,1/2), there exists 7 > 0, such that with probability 
converging to 1 



1 ^ 

— - Y, l{\em - em\ > 2n-') < 7n-(i-2^). 



K + X 



(17) 



m=0 



Proof: All along the proof, 71 , 72 , . . . denotes a sequence of positive constants such that the 



following inequalities hold. The proof of this theorem is deduced from ( [T5| ) and Propositions 3.2 and 
3.3 By ( [T3] ) and ( [T5] ) we can use Proposition 3.2 with 



71 _^ Ji 

n Jn \K + A 



V^^'T.^^ 



1/2 



(18) 



m=0 
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Since 6m and 9m are bounded, we obtain that r] = op(l). Define 

K 



A'^inf- 



c K + X 



^E' 



-'m "m 



m=0 



Equation ([16]) in Proposition 3.2 yields 



A^<^-^ + ^A 



n ^n 
From this quadratic inequality we obtain that A < (72 + \/^2 + '^l\)l \fn, thus: 



c K + X 



1 ^ 



m=0 



73 
n 



(19) 



which shows that 77 < 74/n. On the other hand, by ( [T3| ), ( [T5] ), and Proposition 3.3 we conclude that 
with probability converging to 1: 



1 ^ 



K + X 



-'m '-"m 



> 2ri^) < 75?? 



1-2(5 



(20) 



m=0 



and due to ( [T9| ), pO] ) still holds when replacing t/ by 74/n, thus proving ( [T7| ) and the theorem. In 
particular, letting 6 be as close to 1/2 as needed shows that the estimator 61 tends to 61 with the 
standard rate of convergence n^^'^. ■ 



D. Weak convergence of the density estimator 

Due to the previous results, it is now possible to give a theoretical result about the plug-in estimate 
of the distribution of 6. As suggested in ([8]l, an estimate of the probability density function / can 
be obtained by plugging the approximated values of the shifts into a known density estimate. We 
provide here a result on the weak convergence of the empirical estimator. 

Theorem 3.2: Let g he a continuous function with a bounded derivative. Under the assumptions 



of Theorem 3.1 we get almost surely when M — ;■ 00, n — )• 00 that 



1 ^^ 

m=0 



(21) 



Proof of theorem 3.2 can be sketched as follows: due to the Law of Large Numbers, it is equivalent 
to show that: 



M 



1 *^ 

k=0 



converges almost surely to 0. Since g has a bounded derivative, we can write that the absolute value 
of the latter term is bounded by 

sup^, \g'{x) 



M 

M + l ^ ' 

m=0 



Consequently, due to Theorem 3.1 there exists a constant C such that with probability: 



M + 



1 ^^ 



g{6m)) < C 



m=0 



1 



n" 



+ 



1 



n(i-2S) 
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which completes the proof. More particularly, taking g{ 
E [ip (^7^)], thus showing pointwise consistency, that is 



'^{'~Fr)' ^^ s^*^ ^^"^ ® ^^""^^^ to 



fM,h{x) — > f{x) as M ^ oo,h^0 , 

for any continuity point x of /. 

Remark 3.2: If n remains bounded as ivT —;■ oo, then the parameters 6m cannot be estimated without 
an error, and the observed distribution of {6m} would be a convolution of the distribution of {9m} 
with the estimation error. If n is large enough, the latter distribution is approximately normal with 
variance which is Op (cr^/n). 

Remark 3.3: The discussion was under the assumption that the di,...,d]\j have a continuous 
distribution with a smooth density. If this would not be the case, then the estimated density will be 
approximately equal to a smoothed version of the distribution. 

IV. Applications 
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(c) 



Fig. 2. Results for K=200 and a^ — 0.1; (a) two curves before alignment, (b) comparison between estimated against actual 
values (blue dots) of the shifts for A = 50: good estimates must be close to the identity line (red curve), (c) comparison 
between estimated and actual values of the shifts for A = 10. 



We present in this section results based on simulations and real data. Since we provide a generic 
method suitable for most biological signals, we focus in our simulations on a neuroscience model, 
while our real datasets stem from the ECG framework. In the latter case, we compare our method 
to the one described in 1 1 ] which is often used by practitioners, that is a measure of fit based on the 
squared distance between the average pulse and the shifted pulses leading to a standard Least Square 
Estimate of the shifts. We present in our simulations results for several values of K. However, a 
method for choosing automatically the parameter K has been suggested in ||20|| : since the term 
^A/(^) can be built iteratively and converges when the number of curves tends to infinity due to 



August 16, 2010 



DRAFT 



IEEE TRANSACTIONS IN SIGNAL PROCESSING 16 



Remark |3.H K can be chosen such that 



keK. \ '^ 1=0 J 



where e is a precision threshold fixed by the user. It is however obvious that the optimal choice of K 
should depend on the functional properties of the signal s, which are unknown in a semiparametric 
framework. 

A. Simulations results 

Using simulations we can study the influence of the parameters K and A empirically by providing 
the Mean Integrated Squared Error (MISE) for different values of K and (t^. We use a fixed number 
of blocks A^ = 20. The weighting parameter is chosen as A = [K^], where < /3 < 1. Choosing /3 
close to 1 enables us to align the curves of a given block with respect to the reference curve. 

1) Experimental protocol: Simulated data are created according to the discrete model Q, and 
we compute the estimators for different values of the parameters K, A and a"^. For each curve, we 
sample 512 points equally spaced on the interval [0;27r]. We make the experiment with s computed 
according to the standard Hodgkin-Huxley model for a neural response. The shifts are drawn from a 
uniform distribution Z^(1207r/256, 3257r/256), and 6*0 = vr. 

2) Results: We present in Figure [2] results obtained using the alignment procedure, in the case 
of high noise level (cr^ = 0.1). We also compare our estimations with those obtained with an 
existing method, namely curve alignment according to the comparison between each curve to the 
mean curve |1J. Results using landmark alignment are displayed in Figure [4] We observe that the 
efficiency of this approach is less than our estimate achieves with A = 50, Figure |2]-(b), but is 
better than the estimate with A = 10, Figure [2]-(c). An example of density estimation is displayed in 
Figure |3j using a Gaussian kernel. It should be noticed, however, that /i is a free parameter which 
may exhibit a strong influence on the resulting estimate. A too small value of h leads to over-fitting, 
whereas taking large values of h leads to hide the multimodality of /, if any. Choosing a data-driven 
bandwidth selection of h is thus far from trivial, and out of the scope of this paper: we refer to [21], 
|[22l for an in-depth description of the existing procedures. The bandwidth h is chosen by Silverman's 
"rule-of-thumb" |[23l . We retrieve the uniform distribution of 6. Table |l] shows the estimated MISE 
for different values of K and a'^, with A = [K^'^] and N = 100 blocks. The first given number is 
the value for our estimate, while the second is for the estimator of HJ. Note the dominance of the 
proposed estimator in all cases, in particular for the more noisy situations. 

B. Results on real data 

We now compare the estimated average aligned signal of the two methods applied to ECG signals. 
The data was obtained from the Hadassah Ein-Karem hospital. 
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Fig. 3. Probability density estimation for A^ = 20, K — 200 and a 



0.1. 



a^ 


K=10 


K=20 


K=30 


K=50 


K=100 





0.0305 
0.0306 


0.0228 
0.0234 


0.0198 
0.0199 


0.0153 
0.0156 


0.0106 
0.0109 


10-* 


0.0312 
0.0325 


0.0218 
0.0232 


0.0183 
0.0212 


0.0156 
0.0183 


0.0121 
0.0158 


10-2 


0.0296 
0.0306 


0.0218 
0.0232 


0.0172 
0.0192 


0.0143 
0.0172 


0.0120 
0.0143 


1 


0.0326 
0.0547 


0.0274 
0.0806 


0.0248 
0.0514 


0.0255 
0.0553 


0.0288 
0.0741 



TABLE I 
The MISE of the two density estimates. 



1) Experimental protocol: In order to obtain a series of heart cycles, we first make a preliminary 
segmentation using the method of |f6l|, namely alignment according to the local maxima of the heart 
cycle. We then apply our method, and compare it to the alignment obtained by comparing the mean 
curve to a shifted curve one at a time. We took in this example i^ = 30 and A = K^-"^^. 

2) Results: The results are presented in Figure |5] Comparison of Figures 5(c) and 5(d) shows that 
the proposed method outperforms the standard one. Moreover, when computing the average of the 
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Fig. 4. Shift estimation using Least Square Estimate (see fTl) for one block. 



reshifted heart cycle, we observe that our method allows to separate more efficiently the different 
parts of the heart cycle; indeed, the separation between the P-wave, the QRS-complex and the T-wave 



are much more visible, as it can be seen by comparing the average signals obtained in Figure 5(a) 



and Figure 5(b) 



C. Influence of ECG perturbations on the proposed algorithm 

As we saw, the model fits reasonably well the data we have at hand, and in fact perform better 
than the competing algorithm. The ideal model may not fit other data sets in which the shape of the 
heart pulse changes, or additional perturbations occur. Although no estimation procedure can operate 
under any possible distortion of the data, we now show that our procedure is quite robust against the 
main type of potential distortions. The main type of perturbations related to the processing of ECG 
data are of four kinds (cf. Il24l ): 

• the baseline wandering effect, which can be modeled by the addition of a very low-frequency 
curve. 

• 50 or 60 Hz power-line interference, corresponding to the addition of an amplitude and frequency 
varying sinusoid. 

• Electromyogram (EMG), which is an electric signal caused by the muscle motion during effort 
test. 
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(a) Aligned heart cycles and average signal (black 
dotted curve) using the standard method 



(b) Aligned heart cycles and average signal (black 
dotted curve) using the proposed method 
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(c) Aligned heart cycles using the standard (d) Aligned heart cycles using the proposed 
method, zoom for the first 30 curves method, zoom for the first 30 curves 

Fig. 5. Comparison between the state-of-the-art and the proposed method for the alignment of heart cycles (arbitrary 
units). A semiparametric approach appears more appealing to align cycles according to their starting point, and allows to 
separate more efficiently to P-wave, the QRS complex and the T-wave. 



• Motion artifact, which comes from the variation of electrode-skin contact impedance produced 
by electrode movement during effort test. 

To keep the discussion within the scope of the paper, we chose to focus on two perturbations, 
namely the baseline wander effect and the power-line interference effect. We present in Figure [6] the 
effect of baseline wander on the proposed algorithm. This effect was simulated by the addition of a 
low-frequency sine to the ECG measurements. We took here N = 100, K = 100, A = K^'^ . 

We observe that the proposed curve alignment algorithm is robust regarding this kind of pertur- 
bations, since we observe well-aligning curves and very little change on the average pulse shape 
compared to the one obtained without this perturbation. This can be interpreted as follows: since the 
baseline is in this situation a zero-mean process, the averaging which is done while computing the 
cost function naturally tends to cancel the baseline. However, we remark that the baseline wander 
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(a) 



(b) 



Fig. 6. Effect of the baseline wander phenomenon over the proposed curve alignment method: distorted signal (a), and 
aligned pulses with the average ECG pulse obtained for one block (b) 



phenomenon can cripple the preUminary segmentation, if the amplitude of the baseline is too high. 
This problem can be easily circumvented by means of a baseline reduction prefiltering, such as 
proposed in ||24l. 1251. l|26l. 

We now consider the problem of power- line interference. In order to artificially simulate the original 
signal with a simulation of the power-line interference, we used the model described in |[27l . that is 
we add to the ECG signal the following discrete perturbation: 

y[n] = {AQ + U[n])sin{ f "^j ' 

where Aq is the average amplitude of the interference, /o its frequency, fs the sampling frequency 
of the signal and ^a [n] , Cf [n] are white Gaussian processes used to illustrate possible changes of 
the amplitude and frequency of the interference. The results of the curve alignment procedure are 
presented in Figure |7j for a similar choice of A^, K and A. 

As shown in Figure [7] the proposed algorithm is robust for this kind of distortion, as we retrieve 
about the same average signal after alignment of the curves. It shall be noted, once again, that this 
kind of perturbation can interfere with the segmentation procedure, and that for interferences with 
high amplitude, a prefiltering step as described in ll28l . ll29ll . ll30ll could be applied. Both results 
illustrate the robustness of semiparametric methods for curve alignment, when compared to standard 
FDA analysis. We now apply the proposed algorithm to a real ECG signal displayed in Figure [8j 
which is distorted by power-line interference and baseline wander. After a preliminary segmentation, 
we get the individual pulses displayed in Figure |9] The aligned curves and the obtained average signal 



are presented in Figure 10 It can be noted that the proposed method still performs well and is robust 
to aformentioned perturbations. The average signal obtained by the proposed algorithm is therefore 
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Fig. 7. Effect of the power-line interference phenomenon over the proposed curve ahgnment method: distorted signal (a), 
and aligned pulses with the average ECG pulse obtained for one block (b) 
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Fig. 8. ECG signal with real baseline wander and power-line interference (partial) 



more representative. 



D. Discussion 

Figures |2];b) and |2jc) are a good illustration of Proposition 3.2 Figure |2jc) shows that when A 
is too small, the curves are well aligned within the blocks, but blocks have different constant shifts. 
Taking a larger A addresses this problem, as it can be seen in Figure |2]^b). Our proposed method 



Auguiit 16, 2010 



DRAFT 



IEEE TRANSACTIONS IN SIGNAL PROCESSING 



22 



400 



300 - 



200 



100 



-100 



-200 




20 40 60 80 100 120 140 160 180 200 



Fig. 9. Obtained curves before the curve alignment procedure and associated average signal (dotted). 
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Fig. 10. Aligned curves by means of the proposed method, and average curve (dotted). 
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uses all the available information and not only the information contained in the neighborhood of the 
landmarks. The advantage of our method is evident with noisy curves, when locating the maximum 
of each curve is very difficult. 

Not surprisingly, the number of curves in each block K may be low if the noise variance remains 
very small (first column of Table [I]), the limiting case K = 2 consisting in aligning the curves 
individually. Theoretically, K should be taken as large as possible. However, this comes with a price, 
the largest the K the more difficult is the optimization problem. 

V. Conclusion 

We proposed in this paper a method for curve alignment and density estimation of the shifts, 
based on an M-estimation procedure on a functional of the power spectrum density. The proposed 
estimator, deduced from blocks of signals of size K, showed good performances in simulations, even 
when the noise variance is high. On real ECG data, the proposed method outperforms the functional 
data analysis method, thus leading to a more meaningful average signal, which is of interest for the 
study of some cardiac arrhythmia. Investigations of the associated kernel estimates, with emphasis 
on rates of convergence, should appear in a future contribution. 
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A. Computation of the noise-free part 



If the curves are perfectly aligned, that is if cxi = 6i, equation ( [121 ) becomes 

|c.(fc)P 
(A + K)2 



12 K 



^'(*.''') = 7TT^E^'^- 



l,m=0 

+ ;^i:*A„.(e'"--'. (22, 

^ ^ l,m=0 

[Vk,lVk,m + Wk,lWk,m + i(VfcjWfc,m " Wk^lVk-m)]} 
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Equation ( [12] ) can also be expanded, in order to find an equation close to ([TTjI. We find after some 
calculations that 



Biik,ei 



+ 



\Cs{k)\'' + 



2Xa 



a 



K 



n{\ + K)'' 



K 



Y^xKvli + wli 



1=0 



— -— — 2 Re{ V e''^'' [Vk,iVk,o + Wk,iWk,^ 



+ m,iWk,o-Wk,iVk,o)]} 

^ ^ l<l<m<K 

K 



(23) 



Collecting equations ( [TT] ), ([12]) and ([23]l, we can check easily that the only noise-free part comes 



from the second sum in (10 1, and is equal to Di{(Xi 
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B. Proof of Proposition 3.1 



Using Equations ( [TT] ) and (23 1, we get that for all k the deterministic part of AM{k) — Bi{k,9i) 
vanishes, leading to 

^2 



2 ^ 

AMik)-B,{k,9,) = .j^/^. E(Ki + K 



(M + l)n^^ 



a 



K 



Y.^Kvli + wli 



2Acj2 ^ 

— -— — 2 Re{ V e'^^' [Vk,iVkfl + Wk,iWk,o 

+ -AyKiWkfl-Wk,iVk,o)\} 



2a' 



n{\ + KY 



+ 



Re{ Y. ^'''^''~''^\yKiVk,m + Wk,iWk,; 



l<l<m<K 



i(Vfc,iWfc,m - Wk,iVk^m)]} 



^ '^ 1=0 



i=0 

VniX + K) ■^ 
All the above sums are of i.i.d. random variables, with mean zero (except for the first two sums), and 
all have sub-Gaussian tails. Consequently, there is a constant D, independent of k, such that when 

K -)- oo, A -)- oo and X/K — ;> 0: 



a 



a^ cr\csik)\ 



\AM{k) - Bi{k, 6>i)||^, <D{— + —- + 



n nK WnK 

where for any random variable X, \\X\\fj,^ = y^E{X'^). Hereafter, D is the same constant, large 
enough to keep all the inequalities valid. From the latter inequality, we get that: 



j; i.k{AMik) - B,{k, 9^))'= Op (^) + Op (^ 



We now study the term R{k,ai), that is the part of Bi{k,9i) — Bi{k,ai) which depends on 



the random variables V and W, using their expression in ( (12| ) and (22 1. We get that R{k,ai) 
I + II + III, where 



O" 



n{\ + KY 



K 



Y.\i^'"''{Vk,i + lWk, 



1=0 
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11 



a 



n{\ + KY 



K 



j;V=''(Vfc,/ + iw-fe„ 



1=0 



and 



m^2Re{- ^^(^)^ 



K 



n{\ + Kf 



7 ^ A^A^ 



/,m=0 



(e 



ifc(oi-6'i-o„) 



-iA:6'„ 



)(Vfe,m -iWfc,m)} 



According to Cauchy-Schwarz inequality for Hermitian products: 

,2 



I < 



a 



n 






Jikai -ikai 



K 



1=0 / 



Since the first term in parentheses is equal to 1 and the second is bounded in probability by Markov's 
inequality, we get that / = Op(l/n). // is a sum of independent random variables with zero mean 



and bounded variance, hence ||//||^2 ^ Da"^ /nK. Finally, define U} 



k,m 6 



ik0„ 



iVk,m - ^Wk,m); we 



get for any real number c that 

\III\ 

2\cs{k)\a 



K K 



< 



2\cs{k)\(j I 
n{\ + K) 



i=0 m=0 
K 
/ ^ Amf/; 



ik{ai-ei-c-a^+e^+c) 



l]Uk, 



k,m 



m=0 



Since the random variables {Uk,m,rn = . . . i^} are independent and identically distributed with 
mean zero and finite variances, we get that 



K 



Y, XmUk,m = Or{K'/^). 



m=0 



(24) 
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Consequently, using ( [24| ) and Cauchy-Schwarz inequality, we obtain: 

\III\ 

K K 



< 



2\cs{k)\a 
n{X + Ky 



1=0 m=0 

1 



+ 0f 



K 



^ • — I / ^ ^mt: ^ k,rn 



< 



n{X + K) 
Cs(fc)|o- 



m=0 



/ni? 



+ 0f 



/nZ 



Y^ Xm{e-' 



k(a,„-9^-c) 



l)Uk, 



m=0 



+ 0f 



^nJ{ 



^|c.(fc)k^^J_^^^ 



n Vi^ + A 



'ifc(a„-e,„-c) _ ]^|2 



1/2 



m=0 
K 



—T 2^ Am|f7fe,: 



K + A 



+ Up 



m=0 



X 



Op(i)^ii^f^^A^(a^-0^-c)^ 



1/2 



^ \K + X 



•m=0 



+ 0f 



1 



^nJ{ 



Recall that Xli/eA: ^^ ^^ bounded. Equation (14 1 is obtained if all the bounds above are collected. 



and Assumption (H-2) is used. Eventually, we can check easily that ||-Bi(A;, Qi) — Bi{k, 9i)\\f^.-^ < oo, 
and obtain the last equality of ([14]) by means of Cauchy-Schwarz inequality. 



C. Proof of Proposition 3.2 



Observe that there exists 70 in (0, 1) such that, for all x in [— 7r,7r], we have cosx < 1 — 70^;^. 
Since we have 



^j^ _^X) ^ \iexp{[k{em-am)) 



<1, 



0<m<K 

then there exists, according to the assumption, two constants Kq > and c such that, for K > Kq 
and every k, we have 



Re Yj Am exp (i/c (9^ - am)) 



(K + X) 



m=0 



> 1 - ?? . (25) 
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Hence: 

K 



1 - ?7 < y^ Am cos{k{6m -am- c)) 



-fC + A 



m=0 
K 



< J. , . y^ Am(l -Jok'^iOm -am "C)^), 
A + A — ' 

m=U 

and ( [T6] ) follows. Denote by A^ the number of curves in the block whose alignment error is "far" 
from c (up to a 27r factor): 

K 
N= ^l[\em-am-c\>V^] , 
m=0 



and assume, for simplicity, that the N last curves are the misaligned curves. Equation (25) implies 

K-N-l 

l-r]< y^ Xm cos{k{em -am- c)) 



K + X 

in=0 



1 ^ 

+ ^ . y^ Xm COs{k{9m -am- c)) 



Equation ( 26 1 leads to 



K + X 

m=K-N 
K + X-N N , ,os 2S^ 

= 1 - T^Tofc'^S^' . (26) 



-^^"'- 



which completes the proof. 



D. Proof of Proposition 3.3 

Assume that \c\ > rj^; since A is assumed to be an integer, we can see this weighting parameter as 
the artificial addition of A — 1 reference curves. Since ao = ^o = 0, in that case, [Oq — ao — c\ > rj^, 

thus giving 

N X 1 ox 



K+X K+X 



which would contradict Proposition 3.2 Therefore, we get that \c\ < rj' 



References 

[1] B. W. Silveman and J. Ramsay, Functional Data Analysis, 2nd ed. Springer Series in Statistics, 2005. 

[2] F. Ferraty and R Vieu, Nonparametric Functional Data Analysis: Theory and Practice, 1st ed. Springer Series in 

Statistics, 2006. 
[3] J. O. Ramsay, "Estimating Smooth Monotone Functions," Journal of the Royal Statistical Society Series B, vol. 60, 

no. 2, pp. 365-375, 1998. 
[4] J. O. Ramsay and X. Li, "Curve Registration," Journal of the Royal Statistical Society Series B, vol. 60, no. 2, pp. 

351-363, 1998. 

August 16, 2010 DRAFT 



IEEE TRANSACTIONS IN SIGNAL PROCESSING 29 

[5] B. Ronn, "Nonparametric Maximum Likelihood Estimation for Shifted Curves," Journal of the Royal Statistical Society 

Series B, vol. 63, no. 2, pp. 243-259, 2001. 
[6] T. Gasser and A. Kneip, "Searching for Structure in Curve Sample," Journal of the Amerian Statistical Association, 

vol. 90, no. 432, pp. 1179-1188, 1995. 
[7] A. Kneip and T. Gasser, "Statistical Tools to Analyze Data Representing a Sample of Curves," Annals of Statistics, 

vol. 20, no. 3, pp. 1266-1305, 1992. 
[8] M. Pawlak, Z. Hasiewicz, and P. Wachel, "On nonparametric identification of wiener systems," IEEE Transactions on 

Signal Processing, vol. 55, no. 2, pp. 482^92, 2007. 
[9] W. Greblicki and M. Pawlak, Nonparametric System Identification. Cambridge University Press, Jun. 2008. 
[10] Z. Hasiewicz and G. Mzyk, "Hammerstein system identification by non-parametric instrumental variables," Interna- 
tional Journal of Control, vol. 82, no. 3, p. 440, 2009. 
[11] P. Bickel, C. Klaassen, Y. Ritov, and J. Wellner, Efficient and Adaptive Estimation for Semlparametrlc Models, 1st ed. 

Springer, May 1998. 
[12] M. Vimond, "Efficient estimation for a subclass of shape invariant models," The Annals of Statistics, vol. 38, no. 3, 

pp. 1885-1912, 2010. 
[13] I. Castillo and J. Loubes, "Estimation of the distribution of random shifts deformation," Mathematical Methods of 

Statistics, vol. 18, no. 1, pp. 21-42, Mar. 2009. 
[14] F. Gamboa, J.-M. Loubes, and E. Maza, "Semlparametrlc Estimation of Shifts Between Curves," Electronic Journal 

of Statistics, vol. 1, pp. 616-640, 2007. 
[15] M. Lavielle and C. Levy-Leduc, "Semlparametrlc Estimation of the Frequency of Unknown Periodic Functions and 

its Application to Laser Vibrometry Signals," IEEE Transactions In Signal Processing, vol. 53, no. 7, pp. 2306- 2314, 

2005. 
[16] A. C. Guyton and J. E. Hall, Textbook of Medical Physiology, 9th ed. W. H. Saunders, 1996. 
[17] J. Pan and W. Tomkins, "A Real Time QRS Detection Algorithm," IEEE Transactions on Biomedical Engineering, 

vol. 32, no. 3, pp. 230-236, 1985. 
[18] Y. Ritov, "Estimating a Signal with Noisy Nuisance Parameters," Biometrlka, vol. 76, no. 1, pp. 31-37, 1989. 
[19] A. DasGupta, Asymptotic Theory of Statistics and Probability, 1st ed. Springer Text in Statistics, Mar. 2008. 
[20] T. Trigano, U. Isserles, and Y Ritov, "Semlparametrlc Shift Estimation for Alignment of ECG Data," in Proceedings 

of the EUSIPCO Signal Processing Conference, 2008. 
[21] A. Berlinet and L. Devroye, "A comparison of kernel density estimate," Publ. Inst. Statist. Univ. Paris, vol. 38, pp. 

3-59, 1994. 
[22] M. C. Jones, J. S. Marron, and S. J. Sheather, "A brief survey of bandwidth selection for density estimation," Journal 

of the American Statistical Association, vol. 91, no. 433, pp. 401-407, 1996. 
[23] B. Silverman, Density Estimation for Statistics and Data Analysis, 1st ed. Chapman and Hall/CRC, Apr. 1986. 
[24] O. Sayadi and M. B. Shamsollahi, "Multiadaptive Bionic Wavelet Transform: Application to ECG Denoising and 

Baseline Wandering Reduction," EURASIP Journal on Advances In Signal Processing, vol. 2007, pp. 1-11, 2007. 
[25] M. A. Mneimneh, E. E. Yaz, M. T. Johnson, and R. J. Povinelli, "An Adaptive Kalman Filter for Removing Baseline 

Wandering in ECG Signals," Computers In Cardiology, vol. 33, pp. 253-256, 2006. 
[26] B. Mozaffary and M. A. Tinati, "ECG Baseline Wander Elimination using Wavelet Packets," in Proceedings of World 

Academy of Science, Engineering and Technology, vol. 3, 2005. 
[27] L. D. Avendano- Valencia, L. E. Avendano, J. M. Ferrero, and G. Castellanos-Dominguez, "Improvement of an Extended 

Kalman Filter Power Line Interference Suppressor for ECG Signals," Computers In Cardiology, vol. 34, pp. 553-556, 

2007. 



August 16, 2010 DRAFT 



IEEE TRANSACTIONS IN SIGNAL PROCESSING 30 

[28] I. Christov, "Dynamic Powerline Interference Subtraction from Biosignals," Journal of Medical Engineering and 

Technology, vol. 24, no. 4, pp. 169-172, 2000. 
[29] C. Levkov, G. Mihov, R. Ivanov, I. Daskalov, I. Christov, and I. Dotsinsky, "Removal of Power-Line Interference from 

the ECG: a Review of the Subtraction Procedure," BioMedical Engineering OnLine, vol. 4, no. 50, pp. 1-18, 2005. 
[30] A. K. Ziarani and A. Konrad, "A Nonlinear Adaptive Method of Elimination of Power Line Interference in ECG 

Signals," IEEE Transactions on Biomedical Engineering, vol. 49, no. 6, pp. 540-547, 2002. 



August 16, 2010 DRAFT 



